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Abstract 


Microsoft Excel lacks a built-in function for calculation of Studentized range quantiles. 
Having these quantiles (Q) available in Excel is an advantage as it is easy to perform 
comparisons of means in post hoc tests as part of analysis of variance. Thus, an accurate, 
fast, and easily implemented user-defined function was developed for calculation of these 
quantiles. The developed procedure was about 200 times faster than other accurate Excel 
methods available and provided very low error rates over a wide range of probabilities. 
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Introduction 

Experimentalists often engage in comparisons of means obtained under various treatments. Multiple 
comparison procedures (MCP's) are often carried out by statistical software packages that are less 
suitable for data storage and manipulation and may also require the user to learn a new computer 
programming skill (Kramer et al. (2016)). Microsoft Excel offers excellent data storage and manipulation 
but is very limited when it comes to statistical data analysis. The availability of Excel functions such as 
FDIST, FINV, TDIST, and TINV for the F- and f-distributions allows for some capability to conduct simple 
MCP's such as Fisher's Least Significant Difference (LSD) and the Scheffe tests. Flowever, the lack of 
support for the Studentized range (Q) distribution does not allow for the Tukey Flonest Significant 
Difference (HSD), Student-Newman-Keuls (SNK), or Ryan-Einot-Gabriel-Welsch Q (REGWQ) tests to be 
carried out. A comprehensive review of different MCP methods may be found in Day and Quinn (1989) 
who described, provided equations for, and evaluated many of the commonly used methods. While this 
reference is somewhat dated, it is an excellent review and is still cited in recent works (Franklin et al. 
(2018); Schroeder-Spain et al. (2018)). 

Groups receiving different treatments and the comparison of those means is a post-hoc analysis that 
often takes place after a general analysis of variance (ANOVA) evaluation. The methods by which means 
are compared differ depending on factors such as planned versus unplanned comparisons, comparisons 
between groups of the same or different sizes, comparisons of groups with or without equal variances, 
and parametric versus stepwise comparisons. Flowever, they have in common the use of statistical 
distributions such as the F-, t-, or Q-distributions. It should be noted that Day and Quinn (1989), in 
summary, recommended MCP methods that use the Q-distribution such as the Tukey HSD, REGWQ, and 
Games-Flowell tests. As formulas for this distribution are not available in Excel, it was logical to make a 
simple user-defined function and implement Studentized range quantiles (Q) in Excel. 
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Within this manuscript, the term alpha (a) is used to represent the upper percentile of the Studentized 
range distribution; a is typically in the range of 0.1 to 0.01 in comparisons. It should be noted that many 
statistical tables of the Studentized range quantiles (Q) use one minus a (often denoted by p) to identify 
a table. However, as the upper (rather than lower) percentile (a) is used in several other Excel functions 
(e.g., FINV and TINV), a was chosen as parameter for the user-defined function. The letter r will be used 
to denote the total number of groups in the overall experiment and v to denote the degrees of freedom 
within groups (available in the standard one-way ANOVA table.) 

As with many statistical properties related to distributions, statistical tables based on Studentized ranges 
have long been part of statistical text books but they tend to be limited to a few probabilities (e.g., p = 
0.95, a = 0.05) (Snedecor and Cochran (1967); Walpole and Myers (1989)). For an extensive set of Q 
tables, see Harter (1960) and Gleason (1998). Lund and Lund (1983) developed a numerical integration 
algorithm (AS 190) to estimate Q values for a limited range of a-values (0.10-0.01) and also included a 
rough estimate algebraic algorithm (AS 190.2) for a-values of 0.20-0.05. For a greater range of a, 
Copenhaver and Holland (1988) developed a Fortran algorithm using numerical integration based on 
Gauss-Legrendre quadrature. The latter algorithm is used by R Statistical Software (Team (2016)) and 
other statistical software packages. It is, presumably, also the algorithm used by of the Excel add-in 
RealStats-2007, which is freely available but the algorithm is not viewable (Zaiontz (2007)). While the 
Copenhaver-Holland iterative method is the most accurate, it requires significant computer processing 
time and it would be beneficial to have other options in Excel. 


Theory and calculations 

This manuscript took a non-iterative approach by improving the published AS 190.2 algorithm. 
Algorithm AS 190.2 (called QTRNGO) was part of algorithm AS 190 (Lund and Lund (1983)) and is only a 
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few lines long. Expressed as an Excel user-defined function, the Excel Visual Basic code for QTRNGO is 
shown in Figure 1. 


Function QTRNGO(a, r, v) As Single 
'Algorithm AS 190.2, adapted from 

'R. E. Lund and 3. R. Lund, Algorithm AS 190: Probabilities and upper 
'quantiles for the studentized range, Journal of the Royal Statistical Society. 
'Series C (Applied Statistics) 32 (1983), no. 2, 204-210. 

Dim T, Q As Single 

T = Application.WorksheetFunction.NormSInv(0.5 + 0.5 * (1 - a)) 

If v < 120 Then 

T = T+(TA3 + T)/v/ 4 
End If 

Q = 0.8843 - 0.2368 * T 
If v < 120 Then 

Q = Q - 1.214 / V + 1.208 * T / V 
End If 

QTRNGO = T * (Q * Log(r - 1) + 1.4142) 

End Function 


Figure 1. Algorithm AS 190.2 (QTRNGO) adapted as a user-defined function in Excel. According to Lund 
and Lund (1983) this algorithm is suitable for a between 0.01 and 0.2, excluding v = 2. 


In order to improve the accuracy of AS 190.2, the overall equation for Q in AS 190.2 was expressed as 

(1) Q = T[ln(r-l)-(0.8843 - 0.2368 T- 1.214-V 1 + 1.208T-V 1 ) + 2 05 ] , 

where 7” is a function of a and v with the acknowledgement that Tis the equivalent of Excel's TINV(a,v) 
function. Equation 1 can be rearranged and generalized as 

(2) I N = CrZi + C r Z 2 + C 3 -Z 3 + C 4 -Z 4 , 

where 

W = Q/T— 2 05 , Zi = In(r-l) , Z 2 = Tln(r-l) , Z 3 = v _1 -ln(r-l) , and Z 4 = Tv _1 -ln(r-l) . 

Rather than fitting this equation over a large range of a-values, the equation was fitted to selected 
values of a which were the same as those published by Gleason (1998). Gleason tabulated Q for a = 0.5, 
0.25, 0.1, 0.05, 0.025, 0.01, 0.005, 0.001 when r = 2(1)20(10)40(20)100 and v = 2(1)20, 24, 30, 40, 60, 
120, oo. A typical such table is shown in Figure 2. 
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Figure 2. Q-values as tabulated by Gleason (1998) for a = 0.05 with upper right corner removed. 
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Using the values below the diagonal of the tables, the coefficients of regression Ci, C 2 , C 3 , and C 4 in 


Equation 2 can be determined for each value of a using multiple variable least square regression. As 
shown in Figure 2, only part of the tables were used because it is unlikely that v is less than r in practical 
situations. For example, consider the comparisons of r groups which has >1 replicates. In the case of 2 
replicates, v would take on a value of r+1. In fact, as long as all groups contain replicates, v is always 
greater than r. The results of the regression are shown in Table 1. 

Table 1. Coefficients of regression (rounded to 4 decimal points) for Equation 2 for several a-values. 


a 

Ci 

c 2 

c 3 

C 4 

r 2 

0.5 

392.7272 

-580.2495 

39.0825 

154.6929 

0.9998 

0.25 

157.2291 

-136.0289 

10.6536 

70.9547 

0.9998 

0.1 

40.5373 

-24.3509 

6.2709 

19.6300 
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16.8278 

-8.3896 

4.9942 

8.3570 
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-3.3642 

4.0326 

3.9286 
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3.3294 

-1.1908 

3.1538 

1.6292 
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1.9070 

-0.5972 

2.7118 

0.8964 
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0.001 

0.6344 

-0.1380 

2.0948 

0.2371 

0.9997 


As is noted in Table 1, the correlation was 0.9997, or better, for each a suggesting very good fit by the 
relationship in Equation 2. Now, it would be beneficial to evaluate Q at a-values other than those listed 
in Table 1. It is not recommended to directly interpolate Q in a (Harter (I960)) but Gleason (1999) 
offered a method of indirect interpolation of Q in a by the correlation 

(3) Y= m i + m 2 X + m 3 -X 2 , 

where 
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(4, 5) Y= \n(Q.a, r y2~ 05 -T a/ v~ 1 + r-v - 1 ) and X = -(1 + 1-5 -N^Y 1 . 

The entities T a ,„ and Ni . a /2 are equal to Excel's functions TINV(a,v) and NORMSINV(l-a/2). Thus, with a 
known r- and v-value, we can calculate Y- and X-values for each of the a-values in Table 1. This will allow 
us to determine mi, m 2 , and m 3 in Equation 3. With these constants and the X-value for the desired a 
(via Equation 5), we can calculate the T-value through Equation 3 which allow us to back-calculate the Q- 
value via Equation 4. This method of interpolation can also be used for extrapolation for a-values below 
those listed in Table 1 (Gleason (1999)). When the method for calculating Q was initially tested, the 
results indicated that when v > 100 and a > 0.025 the method under-estimated the Q-value with 0.7%. 
This small adjustment was added to the computer code. In addition, the error of the calculated Q-values 
followed a complicated pattern as a function of r but it appeared that the errors for r > 20 could be 
addressed, as the error was linear with r. This was found true for all a-values and the slopes and 
intercepts were linearly dependent on ln(a). Thus, this small correction term was also added to the 
computer code to adjust Q-values when r > 20. The final Excel Visual Basic code for the user defined 
function QTAB(a,r,v) is shown in Figure 3. 


6 



K. Thomas Klasson, A Simple but Accurate Excel User-Defined Function to Calculate Studentized Range 
Quantiles for Tukey Tests 


Public Function QTAB(a, r, v) As Single 
Dim i, j, k, Done As Integer 

Dim alpha(8), m(3, 4), X(8) , Y(8), C(4, 8), Q As Single 

'If outside range of use, report #VALUE error and quit 
If r < 2 Or r > 100 Or v < r Or a < 0.0001 Or a > 0.5 Then 
QTAB =1/0 
GoTo Done 
End If 


alpha(l) =0.5: C(l, 1) = 392.727161: C(2, 1) = -580.249479: C(3, 1) = 39.08247679: C(4, 1) = 154.6928606 
alpha(2) = 0.25: C(l, 2) = 157.2290812: C(2, 2) = -136.0288671: C(3, 2) = 10.65361764: C(4, 2) = 70.95473271 
alpha(3) = 0.1: C(l, 3) = 40.53727: C(2, 3) = -24.35088143: C(3, 3) = 6.270918133: C(4, 3) = 19.63004347 

alpha(4) = 0.05: C(l, 4) = 16.82778272: C(2, 4) = -8.38960093: C(3, 4) = 4.994168622: C(4, 4) = 8.357020633 

alpha(5) = 0.025: C(l, 5) = 7.860314249: C(2, 5) = -3.364167121: C(3, 5) = 4.032604321: C(4, 5) = 3.928646444 
alpha(6) = 0.01: C(l, 6) = 3.329358: C(2, 6) = -1.190778187: C(3, 6) = 3.153807539: C(4, 6) = 1.629208317 
alpha(7) = 0.005: C(l, 7) = 1.907028105: C(2, 7) = -0.597201494: C(3, 7) = 2.7117957: C(4, 7) = 0.896448643 

alpha(8) = 0.001: C(l, 8) = 0.634353634: C(2, 8) = -0.137994025: C(3, 8) = 2.094748: C(4, 8) = 0.237087945 

'Check to see if a is equal to one of the standard alpha values 
j = 0: Done = 0 
For i = 1 To 8 

If a = alpha(i) Then 
j = i: Done = 1 
End If 
Next i 

'If we do not need interpolation/extrapolation in alpha, calculate the Q-value for a=alpha(j) and exit 
If Done = 1 Then 

QTAB = Appl ication.WorksheetFunction.Tinv(a, v) * (Sqr(2) + Log(r - 1) * (C(l, j) + C(3, j) / v) + _ 

Application.WorksheetFunetion.Tinv(a, v) * LogCr - 1) * (C(2, j) + C(4, j) / v)) 

GoTo Done: 

End If 

'INTERPOLATION AND EXTRAPOLATION IN alpha 

'For each alpha value, calculate Y- and x-values which will be used to interpolate/extrapolate 
For i = 1 To 8 

Q = Appl i cation.WorksheetFunction.TInv(alpha(i), v) * (Sqr(2) + Log(r - 1) * (C(l, i) + C(3, i) / v) + _ 
Appl icati on.WorksheetFunction.TInvCalpha(i), v) * LogCr - 1) * (C(2, i) + C(4, i) / v)) 

Y(i) = Log(Q / Sqr(2) / Application.WorksheetFunction.Tlnv(alpha(i), v) + r / v) 

X(i) = -1 / (1 + 1.5 * Application.WorksheetFunction.NormSInv(l - alpha(i) / 2)) 

Next i 

'Form the matrix that is needed to find coefficients of regression in quadratic equation with Y=fn(X) 

For i = 1 To 8 

m(l, 1) = m(l, 1) + 1: m(l, 2) = m(l, 2) + X(i): m(l, 

m(2 , 3) = m(2, 3) + X(i) A 3: m(2, 4) = m(2, 4) + X(i 

m(3, 3) = m(3, 3) + X(i) A 4: m(3, 4) = m(3, 4) + X(i 

Next i 

m(2, 1) = m(l, 2): m(2, 2) = m(l, 3): m(3, 1) = m(l, 3): m(3, 2) = m(2, 3) 

'Solve matrix using Gaussian elimination 
For i = 1 To 2 

For j = i + 1 To 3 

For k = i + 1 To 4 

m(j, k) = m(j, k) - m(i, k) * m(j, i) / m(i, i) 

Next k 
Next j 
Next i 

m(3, 4) = m(3, 4) / m(3, 3) 

m(2, 4) = (m(2, 4) - m(2, 3) * m(3, 4)) / m(2, 2) 

m(l, 4) = (m(l, 4) - m(l, 2) * m(2, 4) - m(l, 3) * m(3, 4)) / m(l, 1) 

'm(l,4), m(2,4), and m(3,4) are the coefficients of regression in Y=m(l , 4)+m(2,4)*X+m(3 , 4)*XA2 

'Calculate X-value for the desired alpha and use the polynomial to find its Y-value. 

X(0) = -1 / (1 + 1.5 * Application.WorksheetFunction. NormSlnvCl - a / 2)) 

Y(0) = m(l, 4) + m(2, 4) * X(0) + m(3, 4) * X(0) A 2 

'Calculate the interpolated/extrapolated Q-value 

QTAB = (Exp(Y(0)) - r / v) * Sqr(2) * Application.WorksheetFunction.Tlnv(a, v) 

Done: 

'Adjust Q-value if v>100, a>0.025 and if r>=20 
If v > 100 And a > 0.025 Then 
QTAB = QTAB / (1 - 0.007) 

End If 

If r > 19 Then 

QTAB = QTAB / (1 + ((0.0000769 * Log(a) + 0.000605) * r + (-0.000625 * Log(a) - 0.00976))) 

End If 

End Function 


Figure 3. The user-defined function QTAB(a,r,i/) implemented in a Visual Basic module of Excel. Note 


3) = m(l, 3) + X(i) A 2: m(l, 4) = m(l, 4) + Y(i) 
) * YCO 
) A 2 * Y(i) 


that the natural logarithm function in Excel Visual Basic is Log. 
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Results 

The accuracy of the QTRNGO (Figure 1) and QTAB (Figure 3) functions is shown in Table 2. As is noted, 
the QTAB function was considerably more accurate with less than 0.7% absolute mean error and could 
be used over a very large range of a-values. The difference in fit between QTRNGO and QTAB is also 
shown in Figure 4 for a = 0.05. Again, the improvement in accuracy by QTAB over QTRNGO is clear. As 
far as computational efficiency, the 960 a-values used for error calculations of the interpolation and 
extrapolation results in Table 2 took 489 s to calculate with the RealStats-2007's add-in, which 
presumably uses the Copenhaver-Flolland technique (Zaiontz (2007)). The same 960 calculations with 
QTRNGO and QTAB took 1.5 and 2.5 s, respectively. If a higher accuracy is desired, a more complex user- 
defined function has also been published by Klasson (2018), which took a different approach than the 
one developed here 

It is anticipated that this user-defined function will help researchers and student who want to use Excel 
to conduct fast multiple comparisons procedures after initial ANOVA tests. The accuracy, computational 
efficiency, large usable range of variables, and simple programing of the method are beneficial. The 
QTAB function forces a #VALUE error if r < 2, r > 100, v < r, a < 0.0001, or a > 0.5. 
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Table 2. Accuracy of two Excel user-defined functions for Q a , r , v . The accurate values for the top half of 


the table was from Gleason (1998) and the accurate values for the bottom half of the table was 
generated by the QINV function of RealStat-2007 (Zaiontz (2007)). 




QTRNGO 



QTAB 



Max. 

Min. 

Abs. Mean 

Max. 

Min. 

Abs. Mean 

a 

Error 

Error 

Error 

Error 

Error 

Error 

0.5 

0.0% 

-36.6% 

29.6% a 

1.4% 

-1.4% 

0.42% 

0.25 

0.0% 

-14.8% 

29.6% a 

0.9% 

-1.3% 

0.32% 

0.1 

4.4% 

-8.5% 

2.0% 

1.0% 

-0.9% 

0.25% 

0.05 

7.7% 

-13.6% 

1.0% 

1.1% 

-0.9% 

0.27% 

0.025 

7.5% 

-19.4% 

1.2% 

1.1% 

-0.9% 

0.33% 

0.01 

3.8% 

-27.8% 

1.5% 

1.2% 

-0.9% 

0.33% 

0.005 

0.0% 

-34.5% 

3.7% a 

1.2% 

-0.9% 

0.34% 

0.001 

0.0% 

-49.5% 

3.7% a 

1.3% 

-0.8% 

0.37% 




Interpolation and Extrapolation 



0.15 

0.0% 

-7.2% 

4.8% 

0.8% 

-1.1% 

0.30% 

0.075 

6.3% 

-11% 

1.0% 

1.0% 

-0.8% 

0.26% 

0.0001 

0.0% 

-66% 

25% a 

5.4% b 

-2.2% 

0.69% 


a QTRNGO (AS 190.2) is not recommended for these a-values (Lund and Lund (1983)). b The Excel add-in 
RealStats-2007 reported an unreasonable low value for Qo. 0001 , 2,2 and this data point was removed from 
the error calculations. The web site for the RealStats-2007 add-in does not recommend calculating Q for 
low a-values (Zaiontz (2007)) which may be the cause for the slightly greater errors for a = 0.0001. 
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Figure 4. The accuracy of the two fast estimation methods for Q when a = 0.05. The values from the 

statistical table were taken from Gleason (1998). 
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